Terminological paraphrase extraction from scientific literature based on predicate argument tuples

Authors

  • Sung-Pil Choi
  • Sung-Hyon Myaeng
Abstract

Terminological paraphrases (TPs) are sentences or phrases that express the concept behind a term in a different form. Here we propose an effective way to identify and extract TPs from large-scale scientific literature databases. We propose a novel method for retrieving sentences that contain a given terminological concept, based on semantic units called predicate-argument tuples. This method enables effective textual similarity computation and minimizes errors based on six TP ranking models. For evaluation, we constructed an evaluation collection for the TP recognition task by extracting TPs from a target literature database using the proposed method. Through two experiments, we learned that scientific literature contains many TPs that could not previously be identified. The experimental results also showed the potential and extensibility of the proposed methods for extracting TPs.
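
As a rough illustration of the idea of comparing sentences through predicate-argument tuples, the sketch below ranks candidate sentences by how much their tuple sets overlap with the tuples of a term's concept. Everything in it is an assumption made for illustration: the `(predicate, role, argument)` tuple shape, the hand-written tuples (a real system would obtain them from a parser), and the use of Jaccard overlap as the score. The abstract does not describe the six TP ranking models, so this is a stand-in and not the authors' method.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical tuple shape: (predicate, role, argument).
PATuple = Tuple[str, str, str]


def jaccard(a: FrozenSet[PATuple], b: FrozenSet[PATuple]) -> float:
    """Jaccard overlap between two tuple sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(
    concept: FrozenSet[PATuple],
    candidates: List[Tuple[str, FrozenSet[PATuple]]],
) -> List[Tuple[float, str]]:
    """Score each candidate sentence against the concept, best first."""
    scored = [(jaccard(concept, tuples), sentence) for sentence, tuples in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hand-written tuples standing in for parser output (assumption).
concept = frozenset({
    ("translate", "ARG0", "computer"),
    ("translate", "ARG1", "text"),
})
candidates = [
    ("The system parses text.",
     frozenset({("parse", "ARG0", "system"), ("parse", "ARG1", "text")})),
    ("Computers translate documents.",
     frozenset({("translate", "ARG0", "computer"), ("translate", "ARG1", "document")})),
    ("Computers translate text between languages.",
     frozenset({("translate", "ARG0", "computer"), ("translate", "ARG1", "text")})),
]

ranking = rank_candidates(concept, candidates)
for score, sentence in ranking:
    print(f"{score:.2f}  {sentence}")
```

Exact tuple matching is deliberately strict here; a sentence that shares only the word "text" under a different predicate scores zero, which is one reason a practical system would likely need softer matching than this sketch provides.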


Similar resources

Scientific Literature Retrieval based on Terminological Paraphrases using Predicate Argument Tuple

The conceptual condensability of technical terms permits us to use them as effective queries to search scientific databases. However, authors often employ alternative expressions to represent the meanings of specific terms, in other words, Terminological Paraphrases (TPs) in the literature for certain reasons. In this paper, we propose an effective way to retrieve “de facto relevance documents”...

Paraphrase Detection Based on Identical Phrase and Similar Word Matching

Paraphrase detection has numerous important applications in natural language processing (such as clustering, summarizing, and detecting plagiarism). One approach to detecting paraphrases is to use predicate argument tuples. Although this approach achieves high paraphrase recall, its accuracy is generally low. Other approaches focus on matching similar words, but word meaning is often contextual...

Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction

Paraphrases and paraphrasing algorithms have been found of great importance in various natural language processing tasks. While most paraphrase extraction approaches extract equivalent sentences, sentences are an inconvenient unit for further processing, because they are too specific, and often not exact paraphrases. Paraphrase fragment extraction is a technique that post-processes sentential p...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Using Repeated Patterns across Comparable Articles for Paraphrase Acquisition

We focus on paraphrases for information extraction: expressions which should produce the same extraction output. These expressions are acquired automatically from comparable news articles (articles from the same day, on the same topic). Candidate paraphrases are paths in predicate argument structure starting from matching anchors (typically, names) in the two sentences. By using such syntactica...


Journal:
  • J. Information Science

Volume 38

Publication date 2012